The first complete chloroplast genome sequences of Ulmus species by de novo sequencing: Genome comparative and taxonomic position analysis
نویسندگان
چکیده
Elm (Ulmus) has a long history of use as a high-quality heavy hardwood famous for its resistance to drought, cold, and salt. It grows in temperate, warm temperate, and subtropical regions. This is the first report of Ulmaceae chloroplast genomes by de novo sequencing. The Ulmus chloroplast genomes exhibited a typical quadripartite structure with two single-copy regions (long single copy [LSC] and short single copy [SSC] sections) separated by a pair of inverted repeats (IRs). The lengths of the chloroplast genomes from five Ulmus ranged from 158,953 to 159,453 bp, with the largest observed in Ulmus davidiana and the smallest in Ulmus laciniata. The genomes contained 137-145 protein-coding genes, of which Ulmus davidiana var. japonica and U. davidiana had the most and U. pumila had the fewest. The five Ulmus species exhibited different evolutionary routes, as some genes had been lost. In total, 18 genes contained introns, 13 of which (trnL-TAA+, trnL-TAA-, rpoC1-, rpl2-, ndhA-, ycf1, rps12-, rps12+, trnA-TGC+, trnA-TGC-, trnV-TAC-, trnI-GAT+, and trnI-GAT) were shared among all five species. The intron of ycf1 was the longest (5,675bp) while that of trnF-AAA was the smallest (53bp). All Ulmus species except U. davidiana exhibited the same degree of amplification in the IR region. To determine the phylogenetic positions of the Ulmus species, we performed phylogenetic analyses using common protein-coding genes in chloroplast sequences of 42 other species published in NCBI. The cluster results showed the closest plants to Ulmaceae were Moraceae and Cannabaceae, followed by Rosaceae. Ulmaceae and Moraceae both belonged to Urticales, and the chloroplast genome clustering results were consistent with their traditional taxonomy. The results strongly supported the position of Ulmaceae as a member of the order Urticales. In addition, we found a potential error in the traditional taxonomies of U. davidiana and U. davidiana var. japonica, which should be confirmed with a further analysis of their nuclear genomes. This study is the first report on Ulmus chloroplast genomes, which has significance for understanding photosynthesis, evolution, and chloroplast transgenic engineering.
منابع مشابه
Comparative bioinformatics analysis of a wild diploid Gossypium with two cultivated allotetraploid species
Background: Gossypium thurberi is a wild diploid species that has been used to improve cultivated allotetraploid cotton. G. thurberi belongs to D genome, which is an important wild bio-source for the cotton breeding and genetic research. To a certain degree, chloroplast DNA sequence information are a versatile tool for species identification and phylogenetic implications in plants. Different ch...
متن کاملClustering of Short Read Sequences for de novo Transcriptome Assembly
Given the importance of transcriptome analysis in various biological studies and considering thevast amount of whole transcriptome sequencing data, it seems necessary to develop analgorithm to assemble transcriptome data. In this study we propose an algorithm fortranscriptome assembly in the absence of a reference genome. First, the contiguous sequencesare generated using de Bruijn graph with d...
متن کاملRibulose-1, 5-Bisphosphate Carboxylase/Oxygenase Gene Sequencing in Taxonomic Delineation of Padina Species in theNorthern Coast of the Persian Gulf, (IRAN)
Taxonomic study of the genus Padina (Dictyotales, Phaeophyceae) from the Persian Gulf coast was conducted based on morphology and molecular phylogenetic analyses using chloroplast encoded large subunit RuBisCo (rbcL) gene sequences. Detailed descriptions of each species found in this study are described. Several morphological characters, such as number of cell layers composing the thallus, pr...
متن کاملDe Novo Assembly of Complete Chloroplast Genomes from Non-model Species Based on a K-mer Frequency-Based Selection of Chloroplast Reads from Total DNA Sequences
Whole Genome Shotgun (WGS) sequences of plant species often contain an abundance of reads that are derived from the chloroplast genome. Up to now these reads have generally been identified and assembled into chloroplast genomes based on homology to chloroplasts from related species. This re-sequencing approach may select against structural differences between the genomes especially in non-model...
متن کاملLoss of Chloroplast trnLUAA Intron in Two Species of Hedysarum (Fabaceae): Evolutionary Implications
Previous studies have indicated that in all land plants examined to date, the chloroplast gene trnLUAA isinterrupted by a single group I intron ranging from 250 to over 1400 bp. The parasitic Epifagus virginiana haslost, however, the entire gene. We report that the intron is missing from the chloroplast genome of twoarctic species of the legume genus Hedysarum (H. alpinum, H. ...
متن کامل